Clustering Uncertain Data Objects Using Jeffreys-Divergence and Maximum Bipartite Matching Based Similarity Measure
Abstract
In recent years, uncertain data clustering has become the subject of active research in many fields, for example, pattern recognition and machine learning. Researchers have sought to replace the traditional distance or similarity measures in existing centralized clustering algorithms with new metrics in order to tackle uncertainty in data. However, to perform uncertain data clustering, data representation plays an imperative role. In this paper, a Monte-Carlo integration is adopted and modified to express uncertain data in probabilistic form. Then three similarity measures, including one novel measure, are used to determine the closeness between two probability distributions. These measures are derived from the notions of Kullback-Leibler divergence and Jeffreys divergence. Finally, density-based spatial clustering of applications with noise (DBSCAN) and k-medoids are implemented on a synthetic database and real-world databases. The obtained outcomes confirm that the proposed technique outperforms some existing algorithms.
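For readers unfamiliar with the two divergences named in the abstract, the following is a minimal sketch of their standard textbook definitions for discrete distributions. It is not the paper's similarity measure, and it does not reproduce the Monte-Carlo representation or the bipartite-matching step; the function names and the example distributions are illustrative assumptions.

```python
import math


def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions given as equal-length lists."""
    if len(p) != len(q):
        raise ValueError("distributions must have the same length")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue  # the term 0 * log(0 / q) is taken as 0 by convention
        if qi == 0.0:
            return math.inf  # p has mass where q has none
        total += pi * math.log(pi / qi)
    return total


def jeffreys_divergence(p, q):
    """Symmetrised KL divergence: J(p, q) = KL(p || q) + KL(q || p)."""
    return kl_divergence(p, q) + kl_divergence(q, p)


# Hypothetical example distributions, chosen only for illustration.
p = [0.5, 0.5]
q = [0.9, 0.1]
print(jeffreys_divergence(p, q))
```

Kullback-Leibler divergence is not symmetric in its arguments, which is awkward for a similarity measure between two objects; Jeffreys divergence removes that asymmetry by summing both directions.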
Similar resources
An Efficient Divergence and Distribution Based Similarity Measure for Clustering Of Uncertain Data
Data mining is the extraction of hidden predictive information from large databases. Clustering is one of the popular data mining techniques. Clustering on uncertain data, one of the essential tasks in mining uncertain data, poses significant challenges for both modeling similarity between uncertain objects and developing efficient computational methods. The previous methods extend traditional p...
Improving Imbalanced data classification accuracy by using Fuzzy Similarity Measure and subtractive clustering
Classification is one of the important parts of data mining and knowledge discovery. In most cases, the data used for training the clusters is not well distributed. This uneven distribution occurs when one class has a large number of samples while the number of samples in the other class is inherently low. In general, the methods of solving this kind of prob...
Occurrence Based Categorical Data Clustering Using Cosine and Binary Matching Similarity Measure
Clustering is the process of grouping a set of physical objects into classes of similar objects. Objects in the real world consist of both numerical and categorical data. Categorical data are not analyzed in the same way as numerical data because they lack an inherent ordering. This paper describes an occurrence based categorical data clustering (OBCDC) technique based on cosine similarity measure and simple...
Technique For Clustering Uncertain Data Based On Probability Distribution Similarity
Clustering on uncertain data is one of the essential tasks in data mining. Traditional algorithms for clustering uncertain data, such as K-Means clustering, UK-Means clustering, and density-based clustering, are limited to geometric distance based similarity measures and cannot capture the difference between uncertain data with their distributions. Such methods cannot handle uncertain objects...
Clustering of Uncertain Data Objects using Improved K-means Algorithm
Recently, mining uncertain data has attracted increasing attention in the data mining community. Uncertainty arises in information because of inaccurate measurement of results, such as scientific results, data gathered from sensor networks, and measurements of temperature, humidity, pressure and so on. Data from such sources is likely to contain uncertainty. The main task is to handle...
Journal
Journal title: IEEE Access
Year: 2021
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2021.3083969